Goto

Collaborating Authors

 house price prediction


On the Performance of LLMs for Real Estate Appraisal

Geerts, Margot, Reusens, Manon, Baesens, Bart, Broucke, Seppe vanden, De Weerdt, Jochen

arXiv.org Artificial Intelligence

The real estate market is vital to global economies but suffers from significant information asymmetry. This study examines how Large Language Models (LLMs) can democratize access to real estate insights by generating competitive and interpretable house price estimates through optimized In-Context Learning (ICL) strategies. We systematically evaluate leading LLMs on diverse international housing datasets, comparing zero-shot, few-shot, market report-enhanced, and hybrid prompting techniques. Our results show that LLMs effectively leverage hedonic variables, such as property size and amenities, to produce meaningful estimates. While traditional machine learning models remain strong for pure predictive accuracy, LLMs offer a more accessible, interactive and interpretable alternative. Although self-explanations require cautious interpretation, we find that LLMs explain their predictions in agreement with state-of-the-art models, confirming their trustworthiness. Carefully selected in-context examples based on feature similarity and geographic proximity, significantly enhance LLM performance, yet LLMs struggle with overconfidence in price intervals and limited spatial reasoning. We offer practical guidance for structured prediction tasks through prompt optimization. Our findings highlight LLMs' potential to improve transparency in real estate appraisal and provide actionable insights for stakeholders.


Unveiling Location-Specific Price Drivers: A Two-Stage Cluster Analysis for Interpretable House Price Predictions

Gümmer, Paul, Rosenberger, Julian, Kraus, Mathias, Zschech, Patrick, Hambauer, Nico

arXiv.org Artificial Intelligence

House price valuation remains challenging due to localized market variations. Existing approaches often rely on black-box machine learning models, which lack interpretability, or simplistic methods like linear regression (LR), which fail to capture market heterogeneity. To address this, we propose a machine learning approach that applies two-stage clustering, first grouping properties based on minimal location-based features before incorporating additional features. Each cluster is then modeled using either LR or a generalized additive model (GAM), balancing predictive performance with interpretability. Constructing and evaluating our models on 43,309 German house property listings from 2023, we achieve a 36% improvement for the GAM and 58% for LR in mean absolute error compared to models without clustering. Additionally, graphical analyses unveil pattern shifts between clusters. These findings emphasize the importance of cluster-specific insights, enhancing interpretability and offering practical value for buyers, sellers, and real estate analysts seeking more reliable property valuations.


Demo2Vec: Learning Region Embedding with Demographic Information

Wen, Ya, Zhou, Yulun

arXiv.org Artificial Intelligence

Demographic data, such as income, education level, and employment rate, contain valuable information of urban regions, yet few studies have integrated demographic information to generate region embedding. In this study, we show how the simple and easy-to-access demographic data can improve the quality of state-of-the-art region embedding and provide better predictive performances in urban areas across three common urban tasks, namely check-in prediction, crime rate prediction, and house price prediction. We find that existing pre-train methods based on KL divergence are potentially biased towards mobility information and propose to use Jenson-Shannon divergence as a more appropriate loss function for multi-view representation learning. Experimental results from both New York and Chicago show that mobility + income is the best pre-train data combination, providing up to 10.22\% better predictive performances than existing models. Considering that mobility big data can be hardly accessible in many developing cities, we suggest geographic proximity + income to be a simple but effective data combination for region embedding pre-training.


A Multi-Modal Deep Learning Based Approach for House Price Prediction

Hasan, Md Hasebul, Jahan, Md Abid, Ali, Mohammed Eunus, Li, Yuan-Fang, Sellis, Timos

arXiv.org Artificial Intelligence

Accurate prediction of house price, a vital aspect of the residential real estate sector, is of substantial interest for a wide range of stakeholders. However, predicting house prices is a complex task due to the significant variability influenced by factors such as house features, location, neighborhood, and many others. Despite numerous attempts utilizing a wide array of algorithms, including recent deep learning techniques, to predict house prices accurately, existing approaches have fallen short of considering a wide range of factors such as textual and visual features. This paper addresses this gap by comprehensively incorporating attributes, such as features, textual descriptions, geo-spatial neighborhood, and house images, typically showcased in real estate listings in a house price prediction system. Specifically, we propose a multi-modal deep learning approach that leverages different types of data to learn more accurate representation of the house. In particular, we learn a joint embedding of raw house attributes, geo-spatial neighborhood, and most importantly from textual description and images representing the house; and finally use a downstream regression model to predict the house price from this jointly learned embedding vector. Our experimental results with a real-world dataset show that the text embedding of the house advertisement description and image embedding of the house pictures in addition to raw attributes and geo-spatial embedding, can significantly improve the house price prediction accuracy. The relevant source code and dataset are publicly accessible at the following URL: https://github.com/4P0N/mhpp


Boosting House Price Estimations with Multi-Head Gated Attention

Sellam, Zakaria Abdellah, Distante, Cosimo, Taleb-Ahmed, Abdelmalik, Mazzeo, Pier Luigi

arXiv.org Artificial Intelligence

Evaluating house prices is crucial for various stakeholders, including homeowners, investors, and policymakers. However, traditional spatial interpolation methods have limitations in capturing the complex spatial relationships that affect property values. To address these challenges, we have developed a new method called Multi-Head Gated Attention for spatial interpolation. Our approach builds upon attention-based interpolation models and incorporates multiple attention heads and gating mechanisms to capture spatial dependencies and contextual information better. Importantly, our model produces embeddings that reduce the dimensionality of the data, enabling simpler models like linear regression to outperform complex ensembling models. We conducted extensive experiments to compare our model with baseline methods and the original attention-based interpolation model. The results show a significant improvement in the accuracy of house price predictions, validating the effectiveness of our approach. This research advances the field of spatial interpolation and provides a robust tool for more precise house price evaluation. Our GitHub repository.contains the data and code for all datasets, which are available for researchers and practitioners interested in replicating or building upon our work.


An Optimal House Price Prediction Algorithm: XGBoost

Sharma, Hemlata, Harsora, Hitesh, Ogunleye, Bayode

arXiv.org Artificial Intelligence

An accurate prediction of house prices is a fundamental requirement for various sectors including real estate and mortgage lending. It is widely recognized that a property value is not solely determined by its physical attributes but is significantly influenced by its surrounding neighbourhood. Meeting the diverse housing needs of individuals while balancing budget constraints is a primary concern for real estate developers. To this end, we addressed the house price prediction problem as a regression task and thus employed various machine learning techniques capable of expressing the significance of independent variables. We made use of the housing dataset of Ames City in Iowa, USA to compare support vector regressor, random forest regressor, XGBoost, multilayer perceptron and multiple linear regression algorithms for house price prediction. Afterwards, we identified the key factors that influence housing costs. Our results show that XGBoost is the best performing model for house price prediction.


How Shapley Values Work

#artificialintelligence

Although standard Shapley values are largely obsolete to those produced by SHAP, the theory carries over, so it's still useful to understand. I'll explain how SHAP works in a future post, so subscribe if that's something you'd like to see.


House Price Prediction using Machine Learning in Python - GeeksforGeeks

#artificialintelligence

We all have experienced a time when we have to look up for a new house to buy. But then the journey begins with a lot of frauds, negotiating deals, researching the local areas and so on. So to deal with this kind of issues Today we will be preparing a MACHINE LEARNING Based model, trained on the House Price Prediction Dataset. As we have imported the data. So shape method will show us the dimension of the dataset.


Project 3 - House Prices Prediction using Machine Learning with Python

#artificialintelligence

I identify as a Machine Learning and Artificial Intelligence enthusiast who is quite fond of making use of data and patterns to develop insights and analysis. In my opinion, Machine learning is where the computational and algorithmic skills of data science meets the statistical thinking of data science. The result is a collection of approaches that requires effective theory as much as effective computation. There are a plethora of machine learning model, with each of them working best for different problems. As such, I believe understanding the problem setting in machine learning is essential to using these tools effectively.


House Price Prediction using a Random Forest Classifier

#artificialintelligence

In this blog post, I will use machine learning and Python for predicting house prices. I will use a Random Forest Classifier (in fact Random Forest regression). In the end, I will demonstrate my Random Forest Python algorithm! There is no law except the law that there is no law. Data Science is about discovering hidden patterns (laws) in your data.